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Abstract 

By  exploiting  the  relation  of  the  QL  algorithm  to  inverse  iteration 
we  obtain  a proof  of  global  convergence  which  Is  more  conceptual  and  less 
computational  than  previous  analyses.  The  proof  uses  a new,  but  simple, 
error  estimate  for  the  first  step  of  inverse  Iteration. 
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1 . Introduction 

The  QR  Algorithm  has  become  the  preferred  method  for  finding  the 
eigenvalues  of  a given  matrix,  symmetric  or  non symmetric.  One  of  the  high 
points  in  the  field  of  matrix  computations  is  Wilkinson's  discovery 
[Wilkinson,  19681  that  the  algorithm,  when  used  with  the  proper  shift 
strategy,  converges  for  all  symmetric,  tridiagonal  matrices.  This  result 
permits  us  to  write  clean  efficient  programs  for  computing  these  eigenvalues 
there  is  no  need  for  routine  checking  for  rare  unacceptable  cases.  The 
excellent  asymptotic  convergence  rate  for  the  method  was  already  known. 

Each  iteration  in  the  algorithm  is  effected  by  making  a sequence  of 
specially  chosen  plane  rotations.  Wilkinson's  proof  is  based  on  a careful 
scrutiny  of  the  last  three  of  these  rotations  and  a rather  complicated  com- 
putation is  involved.  A careful,  detailed  exposition  of  the  proof  can  be 
found  in  [Lawson  and  Hanson,  1974,  Appendix  R[. 

The  result  is  so  nice  that  one  is  tempted  to  seek  a proof  which  does 
not  require  explicit  formulae  for  the  elements  of  the  next  matrix  in  the 
QR  sequence.  The  one  presented  here  abandons  the  plane  rotations  in  favor 
of  the  relation  of  the  QR  algorithm  to  inverse  iteration,  see  for  instance 
[Parlett  and  Poole,  1973) . The  discussion  is  in  terms  of  the  QL  algorithm 
which  is  a convenient  variation  of  the  original  QR  algorithm.  Section  3 
gives  more  details. 

We  try  to  adhere  to  the  standard  notational  conventions:  lower  case 
roman  letters  for  column  vectors,  lower  case  gre^k  letters  for  scalars  (all 
real  here),  and  upper  case  roman  letters  for  matrices  (reserving  symmetric 
letters  for  symmetric  matrices).  We  write  7T  for  the  transpose  of  7, 

I for  (e^ .e^,. . . ,e  ),  and  A-\  for  A - \ I . All  matrices  are  n ' n unless 
the  contrary  is  stated,  D x n * *x^x  and  we  write  trl diagonal  matrices  A 


as  shown  below: 
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The  QL  transformation,  with  shift  0,  transforms  symmetric  tridiagonal 

A into  symmetric  tridiagonal  A * QTAQ  where  Q = 0n)  is 

orthogonal  and  depends  on  a. 

For  the  busy  reader  who  is  familiar  with  the  subject  we  present  a 
brief  outline  of  the  argument  now.  One  special  piece  of  notation  is  needed: 
MP^  denotes  the  set  of  all  monic  polynomials  (leading  coefficient  1)  of 
degree  k.  We  observe  that 


|B,B?)  = min  l<t>(A)q,  II  , 
1 c <f><EMP2  ' 

< |(A-a1)(A-a)q1l, 
= II  (A-a^  )e^ r U 

= | B1 1 x 

< |a-|-o|  - 1 B2I 

f.  I ^1  ^2 1 * 


Lanczos , 

the  artful  choice, 

the  connection  with  inverse  iteration.  Lemma  2, 

since  A Is  tridiagonal, 

if  0 is  Wilkinson's  shift,  Lemma  4, 

by  a characteristic  property  of 

Wilkinson's  shift. 


Only  the  strict  inequality  ts  really  new  and  a sharper  form 

of  it  is  used  in  Sections  5 and  6 to  show  that  < 

< (2/5) (B^-1  ^B^"1  for  all  k and  also  that  < IB^I/*#-  This 

(kl 

establishes  global  convergence,  i.e.  B|  ' -*•  0,  in  a clean  way. 
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2 • Orthogonal  Reduction  to  Tri diagonal  Form 
Any  symmetric  matrix  M may  be  reduced  to  tri diagonal  form  A by  an 
orthognoal  similarity  transformation.  In  symbols 

(0)  A = GTMG  , I = GTG  = GGT  . 


In  fact,  when  the  off  diagonal  elements  Bj  are  not  zero  then  A is 
completely  determined  by  g^  (or  by  gn).  Our  interest  is  in  expressions 
for  products  of  the  B . , j = 1,2,...  . From  the  pioneering  work  |Lanczos, 

J 

1950|  we  can  deduce  that 

1 3-j  * * * | ■ m1nll<f>(M)g1  H 

over  all  ntonic  polynomials  4>  of  degree  j with  equality  only  when  ^(\) 
is  the  leading  principal  j*j  minor  of  X-A. 

However  we  prefer  to  use  some  alternative  formulas  which  yield  rather 
more  information  and  are  also  quite  well  known. 

A useful  way  of  understanding  the  relationships  hidden  in  (0)  is  to 
equate  columns  on  each  side  of  the  equation 


GA  = MG 


and  deduce  that  the  columns  {g, ,g?, . . . ,g  .1  form  an  orthonormal  basis  for 
the  so-called  Krylov  subspace  K.  of  Fn  which  is  spanned  by 

J 


g-j  »Mg.j  ,M  g -j  > . . . « M g-j  . 


Let  Pj  denote  the  orthogonal  projection  of  F onto  K.  and  let  P.  be 
its  complement.  For  example,  P1  = g^gj,  P2  = + • 


LEMMA  1.  Let  GA  = MG  with  G = (g^,...,gn)  orthogonal,  then 

g2^  = p^  . 

<33^2^1  = • 


Proof.  By  equating  the  (1,1),  (1,2),  and  (2,2)  elements  on  each  side 
of  (1 ) we  find 

(2)  d-j  = g-jMg-j  , B-j  = g^g2  = 92Mg-j  , a2  - 92Mg2  • 

Now  equate  first  columns  on  each  side  of  the  equation  GA  = MG  and  rearrange 

9261  = M9!  “ 9]“!  » 

= Mg1  - g1(gjMg1)  , using  (2)  , 

= (i-g1g{)Mg1  , 

= P-jMg-j  • 

Next  equate  the  second  columns  on  each  side  and  rearrange: 

93 &2  = Mg 2 " 92^2  ■ 9}^  » 

* Mg2  - 92(9^2)  - g1(g{Mg2)  , using  (2)  , 

(^)  T J 

= (i-g292-g1g1 )Mg2  . 

= P2Mg2  • 

Multiply  (4)  by  0,  and  use  (3)  to  obtain  the  formulas  in  the  lemma.  □ 

In  the  next  section  we  will  apply  this  lemma  to  the  case  when  M = A 
is  also  tridiagonal  and  Mg.  lies  in  the  plane  of  g.  and  e,. 
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3.  The  QL  Transform  and  Inverse  Iteration 
The  QL  transform  of  A is  denoted  by  A,  has  the  same  form  as  A, 
and  is  defined  by 

(1)  A » QTAQ 

where  Q is  the  orthogonal  matrix  which  satisfies 

(2)  A - o 3 QL 

and  L is  lower  triangular  with  positive  diagonal  elements.  The  scalar  o 
is  called  the  shift.  Note  that  Q is  the  result  of  performing  the  Gram- 
Schmidt  orthonormal i zing  process  to  the  columns  of  A-o  from  right  to  left. 

The  QL  algorithm  iterates  the  QL  transform,  choosing  an  appropriate  shift 
at  each  step. 

The  QL  transform  is  related  to  the  earlier  QR  transform  in  a very  simple  way 

if  I = (e  e e,)  and  A is  the  QL  transform  of  A then  iAl  is 

n*  n-l  i 

the  QR  transform  of  IAI.  The  QL  algorithm  has  some  minor  advantages  from 
the  programner' s point  of  view  and  has  become  the  preferred  method.  Conse- 
quently we  will  present  our  results  in  its  terminology. 

In  practice  the  matrix  Q which  turns  A into  A is  never  formed 
explicitly.  Even  in  then  the  columns  of  Q are  determined  in  the  order 

qn,qn  i q2’V  Nevertheless  A is  completely  determined  by  q1  and 

q^  connects  the  QL  transformation  with  simpler  processes  like  inverse 
iteration. 

We  are  now  going  to  formulate  a result  which  is  quite  well  known. 


LEMMA  2.  Let  Q^AQ  * A be  the  QL  transform  of  unreduced  tridia- 
gonal A unth  real  shift  a.  Then  = Qe^  satisfies 

(A-o)q1  = e^T  . 

If  o is  an  eigenvalue  of  A then  t = 0;  otherwise  t is  the 
scale  factor  whioh  ensures  that  lq-|  I *1;  so  i - 1/1 (A-o)"1e1 1 . 


Proof.  Transpose  equation  (2)  above,  post  multiply  by  Q,  and  use 
QTQ  * I to  find 

(3)  (A-o)Q  * lT  . 

Equating  column  1 on  each  side  shows  that 

(4)  (A-o)q1  = e^  , lu  > 0 . 

If  o is  not  an  eigenvalue  then 

(5)  1 = I q-| I * l(A-o)‘1e1l»t11 

and  we  have  written  t for  If  o is  an  eigenvalue  then 

0 = det(A-cr)  = det  Q • det  L . 


The  Gram-Schmidt  process  begins  with  q„  = (A-o)e  /t  . Because  A is 

n n nn 

unreduced  i.  = l(A-o)e  I t 0.  Moreover,  for  the  same  reason,  the  last 
nn  n 

(n-1)  columns  of  A-o  are  linearly  independent.  Consequently 


ljj  > ° * 


n,n-l .... ,3,2  , 


for  all  o.  It  follows  that  on  the  last  step  of  the  Gram-Schmidt  process 
a null  vector  is  obtained.  Hence  = t ■ 0 and  q1  may  be  any  unit 


vector  orthogonal  to  all  the  other  q’s.  This  gives  only  a choice  of  sign 
for  q^  and  in  either  case  (A-o)g1  “0.  □ 


Equation  (4)  shows  that  the  first  column  of  Q is  the  normalized 
result  of  one  step  of  inverse  iteration  with  shift  o. 

We  now  use  lemma  2 to  get  expressions  for  the  off  diagonal  elements 
which  are  produced  in  the  course  of  the  QL  algorithm. 


LEMMA  3.  let  A Q AQ  be  the  QL  tw»»8/V*m  e?  A with  real  ehi;'t 


1^1  a 1 1 sin  9^  | 

- T 1 81  sin  , 

iv  9^  ie  the  angle  between  and  the  Knit  lev  e:\tee  K.., 

i • 1,2. 


Proof.  Recall  that  in  Lenina  1 P^  1 I - q^qj,  P0  = I - q^qj  - q2q2- 


We  have 


q2^  * P]Aq1 

■ f*1  (q^o  + e^  r) 


Lenina  1 


Lenina  2 


■ piv 


Further 


* t(e1  - q1  cos  9^ ) 


q3^1^2  * ^^1  Aql  ’ Len,1'a  * 

* iP2A(e1  - q1  cos’t^ ) , two  lines  up, 

« xP^e,^  + e^  - q1  cos  9^)  , A is  tridiagonal, 

■ nS^2e?  * annih11ates  K2  3 span(ql,Aq1) 

* span(q.j  ,e^ ) . 


On  taking  norms  the  results  follow.  □ 


i 
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Using  the  same  technique  it  can  be  shown  that 

Wh  = BlB2'"6j-l  1 S1n0j  ' 

Lemma  3 holds  for  any  shift  strategy  but  the  global  convergence  follows 
from  a simple  bound  on  r which  holds  when  Wilkinson's  shift  is  used. 


4.  Wilkinson's  Shift 

al  B1 

Given  A then  Wilkinson's  shift  u>  is  that  eigenvalue  of  J 

B1  a2 

which  is  closer  to  c^.  In  case  of  a tie  either  eigenvalue  may  be  used. 


So  we  have 


and 


(a-|-u>)(a2-io)  - B-|  = 0 


la1_aJl  £ 1 <*1  -to  * | . 


Let  us  write  6 = (a2-a1)/2  and  observe  that 

<*>,  (d’  = (a1+a2)/2  ± 


This  shows  that 


Ictj-wl  <_  I Ot2-Oil 


with  equality  if,  and  only  if,  6=0.  By  noting  that  |B^ 
mean  of  | ct^ -io ( and  |a2-u)|  we  have 


is  the  geometric 


| ot-j  -co  | B-|  / a|-u 

T,  |a2-a)j  ~ V a2_‘° 


< 1 


with  equality  if,  and  only  if, 


6 = 0. 
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5 . A Residual  Estimate  for  Inverse  Iteration 
Since  A is  symmetric  and  tridiagonal  we  know  that  when  is  small 
compared  with  lo^-o^l  then  e1  Is  a good  approximation  to  an  eigenvector 
and  Wilkinson's  shift  ai  is  an  even  better  eigenvalue  approximation  than 

a.|.  A well  known  way  to  obtain  an  improved  normalized  eigenvector  is  to 

solve  for  q1  the  equation 

(1)  (A-ui)q1  = e^T 

where  t is  the  positive  scale  factor  which  ensures  that  II q^ II  = 1. 

Our  concern  here  is  at  the  opposite  extreme.  If  is  not  necessarily 

small  and  e^  is  a poor  approximation  to  an  eigenvector  of  A how  bad  can 
(m.q^)  be  as  an  approximate  eigenpair?  A good  measure  for  this  approxima- 
tion is 

t/IAI  , 

which  is  the  norm  of  the  "residual"  vector  (A-w)q^  relative  to  PAH. 

We  now  show  that  (w.q^  cannot  be  arbitrarily  bad;  in  fact  t < | | . 
For  convenience  we  write 

a..  = a,j  - a) 

and  define  p = (tt^  by 

(2)  (A-w)p  = e]  . 


LEMMA  4.  When  Wilkinson's  shift  w is  used  in  (1)  then 


1 + B1 

^1*4  44 


-1  oi.j  1^2 


P 


1 


Proof.  If  u«  Is  an  eigenvalue  of  A then  the  Ql  transform  will  make 
q^  an  eigenvector,  so  t * 0.  From  now  on  assume  that  A-u>  Is  invertible. 
From  (1)  and  (2)  we  have 

^ - P/lpl 
and 

(3)  t2  - 1/lpl2  < l/(n2+n2+n2)  . 

The  first  two  equations  In  (2)  are 

(4)  ♦ l^it?  * 1 , 

(5)  RjHj  + lV3  * 0 ‘ 

Recall  the  definition  of  w and  form  (Rj/iij ) * (4)  - (5)  to  find 

(6)  0 *■  0 - 02*3  * . 

In  fact  (6),  together  with  the  fact,  from  (4).  that  and 

cannot  vanish  simultaneously.  Is  sufficient  to  prove  that  t < l^l* 

? ? 

However,  we  can  easily  bound  11 1 ti ^ away  from  0.  Ry  elementary  geometry 
the  distance  of  the  origin  from  the  line  (4)  In  the  n^,n9  plane  Is 
1/roT+R'.  Hence 

(7)  n2  * «2  1 ) 

and  the  result  follows  readily  from  using  (6)  and  (7)  In  (3).  H 

The  surprisingly  simple  expression  (6)  for  ensures,  by  Itself,  that 
,s  awnotone  decreasing.  The  extra  Information  contained  In  (7)  shows 
that  the  decrease  Is  linear  right  from  the  start, 

Lenina  4 gives  more  Information  than  we  need.  To  simplify  later  discus 
slon  we  use  the  harmonic  mean,  defined  for  positive  t,  n by 

H(t.n)  2/U'W1)  . 


■^i  iiiwiiij- 
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On  majorizing  |i^|  by  fl,  Lemma  4 simplifies  to  the  useful 
["  COROLLARY.  t2  < H(H2.Jii2) 


6.  Global  Convergence  of  the  Ql  Algorithm 
The  QL  algorithm  produces  a sequence  of  unreduced  symmetric  tridlaqonal 

(i\ 

matrices  Av  k = 1,2,...  and  the  glorious  fact  Is  that,  always, 

(kl  (kl 

♦ 0 rapidly  as  k » *>,  revealing  ot|  ' as  an  increasingly  good 

approximation  to  an  eigenvalue  of  A^\  When  Is  accepted  as 

negligible  the  algorithm  continues  to  transform  all  but  the  first  row  and 
(kl 

column  of  A'  ' and  thus  all  the  eigenvalues  may  be  found  In  turn. 

( kl 

The  convergence  of  |i<j  ' | need  not  be  monotonic  but  the  kev  fact  is 
(kl  (kl 

that  (|g|  | , k»l,2,...l  Is  monotone  decreasing  and  its  limit  is  0. 

Using  the  corollary  of  Lenina  4 in  Lenina  3 and  noting  that  H(\,nl  - »’;n 
we  obtain 


Wflki noon  'e  ehi’/Y  i'o  utu\( 
2 *> 


LTMMA  5. 

(al  fl2  s i2  v min(2|i‘1  ,iv,,  |t^t'?|/*?l 
(b)  (ty2)2  - H2i2  < (t^1ti2)4’n(('2/d2.j)  < (iy.,V  . 


>u  Qt  'i*;'  t hr: , 


This  establishes  the  monotonic  decline  of  but  to  see  that 

I i- 

the  limit  is  zero  it  suffices  to  consider  two  successiv  steps  in  the 
algorithm  and  so  the  superscript  k can  be  dropped. 

Lenma  5(b)  shows  that  the  reduction  in  t<, is  substantial  unless 
ji<2/i^|  is  small.  However  Lenma  5(a)  shows  that  such  an  unfortunate  ratio 


r 


!"sc!^r 


1 
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cannot  persist.  The  next  result  makes  this  precise.  We  recall  that  the 
harmonic  mean,  H(C,n),  of  positive  numbers  C and  n Is  synmetrlc,  homogeneous 
(of  degree  1)  and  monotonic  Increasing  in  each  of  Its  arguments  separately. 

* • 

THEORFM  1.  Let  A,  A,  A be  three  sueoessive  terms  in  the  QL 
se^uenoe  ut* in.}  Wilkinson's  shift.  Then 

(B,^)2  < (61B2)22/(3+A)  < (2/5)(p1B2)2  . 


°2°2  A?*2  a 

Proof.  B^Bj  < B, r , Lenina  5b  for  A, 

< B^H(B^,2-B2)  » Lenina  4 Corollary  for  A, 

= H(B^-B2B2)  , homogeneity  of  H, 

< H(t4,^B2t2)  , monotonicity  and  Lenina  5 for  A, 

2 12  2 

» t ) , homogeneity  and  symmetry  of  H, 

< H(p2,^fi|) ) • Lenr,a  4 corollary  for  A* 

* B2B2H(7j-,p"1)H(7r,H(l  ,^p))  , homogeneity  of  H,  p = B^/B2  , 

< B^B22/(3+»1>)  , maximizing  over  all  p s 0. 


We  note  that 


<*LM 


fnr1 


'•'1  p 

1/l|+  (jjp  + p'1 


)/2] 


n 


COROLLARY  1.  For  the  QL  a l gori  thin  isi  th  Wi  Iki  nson ’s  shift 
fljk^k)  — 0 ,i*  k — « . 
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Proof.  |B<2k*»42k‘'>|  < l42k)42k)|  < (2/s)k|e<1>s<1>|  . n 

The  asymptotic  convergence  rate  Is  much  better  than  this.  What  is 
remarkable  Is  that  convergence  Is  linear,  with  a good  ratio,  right  from 
the  start. 


COROLLARY  2.  For  tho  QL  algorithm  irith  Wilkhuion'r  #hi  ft 


' — ► 0 <}«♦  k — *•  «'  . 


Proof.  By  Lenina  5 ( a ' , ^ v | P2 1 Convergence  follows  fro 

Corollary  1. 
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7 . Local  Convergence 

We  suppress  the  fact  that  all  the  elements  , 8^,  etc.  depend  on  k, 
the  iteration  count.  We  know  that  as  k ->  <»  both  >0,  p^  ► 0. 

In  the  usual  case  > 0 as  well.  In  this  case  -►6/0  because 
the  eigenvalues  of  an  unreduced  tridiagonal  matrix  are  distinct  (although 
sometimes  very  close).  The  question  we  take  up  now  is  the  asymptotic 
convergence  rate.  From  Lemma  3 |(L  | * i|sin  0^|,  but  the  estimate  for  i 
in  Lerrnia  4 does  not  reflect  the  asymptotic  regime.  In  fact,  as  k ♦ «>, 


t = 1/llpi  - 0(1/1^)) 


where  (A-w)p  = e^. 

Solving  these  equations  as  before  yields 


-CCr 

L 

"3  = 


71 2 = 


12 


a2a3 

B1B2 


B31t4 


In  the  usual  regime 


I sin  0 


1 


-2- 

_ 

n2n3 

rt2P3n4 

" TT 

* S1S2 

2 . 

• H2  + • • 

2 '"27 

l-n,+TT_+- 

. . +TT  > 

.0(1^1) 


and  using  the  first  terms  in  the  expressions  for  and 


I 


i 


This  is  better  than  cubic  convergence. 
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We  have  not  been  able  to  prove  that  the  analysis  given  above  always 
obtains.  The  possibility  remains  open  that  a?  -*0,  -+  n f 0.  In  this 

case  n,  still  dominates  the  other  elements  of  tt  but  it  is  the  third 
term  in  the  above  expression  for  n,  which  brings  this  about.  Thus,  in 
such  a case 

T ~ f^/|a2|  = 1^1  , 

2 

| sin  0,)  ~ ^1'>3/P2 


8, I = o( |51 61 | ) = 0(3, ) 


Thus  guadratic  convergence  occurs  even  in  this  unstable,  and  very  special, 
eventual ity. 
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